ABSTRACT

An extension of the methods of path analysis to include studies of
categorical data was described and exemplified in a causal study of
college dropouts. The usual models and methods of causal (path)
analysis were designed for the study of quantitative variables and are
not appropriate when the variables under investigation are categorical.
Extensions of the loglinear analysis of contingency tables to include
cases with a specified order of priority for variables in a causal
model are used to expand the range of theoretical questions that
educational researchers can address. The example presented in this
document concerns the problem of withdrawal from institutions of higher
education using data from the National Longitudinal Study. The model
related a respondent's race and ability, two exogenous variables, to
postsecondary grades, and both grades and ability to the dropout
variable. The results support the proposed causal model and indicate
that the effects of ability and grades are independent of race.
(Author/CTM)
PATH ANALYSIS WITH CATEGORICAL DATA: APPLICATIONS TO EDUCATION

Lee M. Wolfle



The usual models and methods of causal analysis (see, e.g., Wright,
1921; Duncan, 1966, 1975) were designed for the study of quantitative
(continuous) variables. However, the usual methods of path analysis are
not appropriate when the variables under investigation are categorical.
Recent advances in the loglinear analysis of contingency tables (see, e.g.,
Bishop, Fienberg and Holland, 1975) have been extended by Goodman (1972,
1973, 1979) to include cases with a specified order of priority for
variables in a causal model.

One of the salient features of path analysis is that the formulation
of causal models forces the researcher to express ideas in explicit form,
and allows one to read unambiguously the ideas of others (Wolfle,
forthcoming). Goodman's analog to quantitative path analysis also creates
diagrammatic representations of causal effects, which are estimated with
loglinear, or logit, models. While the method of estimation necessarily
varies between quantitative and qualitative path analysis, both systems
force a degree of explicitness desirable in social science documents.

To illustrate an educational application of path analysis with
categorical data, consideration will be given to the problem of withdrawal
from institutions of higher education. Tinto (1975) has suggested (after
Durkheim, 1951) that dropouts are more likely to be individuals
insufficiently integrated into the fabric of the social system. Consequently,
students in institutions of higher education are more likely to drop out
when they fail to become integrated into the relatively meritocratic reward
structure. Specifically, the model proposed here relates a respondent's
race and ability (two exogenous variables) to postsecondary grades, hence
to the dropout variable. The assumed model is shown in Figure 1.

Note 1: A reviewer of an earlier version of this paper incorrectly noted
that path analysis with categorical data could be achieved through LISREL
(Jöreskog and Sörbom, 1978). Au contraire, LISREL (like all regression
or correlation procedures) requires the input (or computation) of a
variance-covariance matrix; because such matrices presuppose continuous
data, LISREL is not appropriate when the data are categorical.



[Path diagram relating RACE, ABILITY, GRADES, and WITHDRAWAL]

Figure 1. Conceptual Model of College Withdrawal

The model specifies that race and ability are associated, but for
reasons unanalyzed in the present model; hence, these variables are
exogenous. Both race and ability are seen as additive causes of college
grades (a manifest measure of integration into the social system). To
explain withdrawal from college, ability and grades are hypothesized to
be additive causes of withdrawal, but the model specifies no additive
effect of race once ability and grades have been controlled for.

The data used in the estimation of this model were taken from the
National Longitudinal Study of the 1972 high school graduating class (see
Levinsohn et al., 1978). The sample included in this analysis represents
that portion of the 1972 high school graduating class which went on to
attend in the fall of 1972 either a two-year or four-year college or
university. Since the initial survey in 1972, the respondents have been
resurveyed three times, in 1973, 1974, and 1976. At each followup survey
respondents were asked if they were in school, and if not, had they
graduated. Dropouts were defined as those people who entered a two or
four-year college or university in the fall of 1972, who subsequently
reported themselves not to be attending school, but did not graduate, and
had not re-entered school at the time of the last survey in 1976. Nondropouts
were defined as people who had attended school continuously since graduating
from high school, or had graduated.

Note 2: Withdrawal from college may be analogous to suicide in Durkheim's
(1951) analysis.

Three variables were considered causal antecedents of college
withdrawal. These were respondent's race, ability, and college grades.
Race was measured by a dichotomous variable which classified whites into one
group, and blacks, Mexican-Americans, Puerto Ricans, and American Indians
into a second group. Ability was a trichotomous variable calculated from
a scaled ability test administered while the respondents were in their senior
year of high school; the respondents were classified according to whether
they fell into the lower quartile, the middle two quartiles, or the upper
quartile of the test-score distribution. Grades were determined by
respondent's self-report and were categorized into four groups: mostly A's,
mostly B's, mostly C's, or mostly D's or F's. The question on grades was
repeated in each followup survey. For nondropouts the answer to the 1976
survey was used to classify the respondents. For dropouts the answer to the
last survey before dropping out was used (because these respondents would
have been routed around the grade question in subsequent surveys). Thus,
the grades for dropouts could have been taken from either the 1973, 1974,
or 1976 survey.



These data are shown in Table 1, which cross-classifies 4392 respondents
according to their distribution among the variables already defined. Note
that these numbers are frequency counts, and that Table 1 is merely a
2x3x4x2 contingency table.
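As an illustration only, the 2x3x4x2 layout can be held as a mapping from (race, ability, grades, dropout) to a frequency count. The sketch below is not part of the original analysis. The names `TABLE1`, `build_table`, and `margin` are hypothetical, and the two "No" counts for mostly-A students in the Other/Low and Other/High rows (13 and 35) are assumed readings, chosen because they make the cells sum to the stated 4392.

```python
from itertools import product

RACES = ("White", "Other")
ABILITIES = ("Low", "Medium", "High")
GRADES = ("A", "B", "C", "D or F")

# (dropout yes, dropout no) pairs in Table 1 row order:
# race varies slowest, then ability, then grades.
PAIRS = [
    (15, 9), (22, 14), (85, 56), (12, 1),         # White, Low
    (103, 234), (127, 333), (270, 368), (43, 9),  # White, Medium
    (140, 806), (106, 486), (157, 331), (27, 6),  # White, High
    (12, 13), (15, 27), (75, 68), (13, 4),        # Other, Low (13 is assumed)
    (17, 29), (27, 56), (65, 95), (14, 2),        # Other, Medium
    (5, 35), (3, 18), (14, 25), (0, 0),           # Other, High (35 is assumed)
]


def build_table():
    """Return a dict keyed by (race, ability, grades, dropout)."""
    table = {}
    rows = product(RACES, ABILITIES, GRADES)
    for (race, ability, grade), (yes, no) in zip(rows, PAIRS):
        table[(race, ability, grade, "Yes")] = yes
        table[(race, ability, grade, "No")] = no
    return table


def margin(table, axes):
    """Sum over every variable not in axes (0=R, 1=A, 2=G, 3=D)."""
    out = {}
    for key, count in table.items():
        sub = tuple(key[i] for i in axes)
        out[sub] = out.get(sub, 0) + count
    return out


TABLE1 = build_table()
print(len(TABLE1), "cells,", sum(TABLE1.values()), "respondents")
print(margin(TABLE1, (3,)))  # dropouts versus nondropouts
```

Marginal tables such as `margin(TABLE1, (0, 1))`, the race by ability distribution, are the quantities that the bracketed model names [RA], [RG], and so on refer to.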

To determine whether the causal model specified above is congruent
with the observed data, the observed frequencies in the table are compared
with the frequencies estimated by the model, and an assessment is made of
the goodness of fit using the likelihood ratio chi-square statistic. The
expected frequencies are estimated using the loglinear approach (see, e.g.,
Bishop, Fienberg and Holland, 1975; Fienberg, 1977).
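For readers unfamiliar with the statistic, the likelihood ratio chi-square is twice the sum, over cells, of the observed count times the log of the ratio of observed to expected counts. The following is a minimal sketch on a small 2x2 table of made-up counts, fitted under simple independence; the function name and the counts are hypothetical and are not taken from the study.

```python
import math


def likelihood_ratio_chi_square(observed, expected):
    """G^2 = 2 * sum(obs * ln(obs / exp)); empty observed cells add nothing."""
    total = 0.0
    for obs, exp in zip(observed, expected):
        if obs > 0:
            total += obs * math.log(obs / exp)
    return 2.0 * total


# Hypothetical 2x2 table, listed row by row.
observed = [30, 20, 10, 40]
row_totals = [50, 50]
col_totals = [40, 60]
n = sum(observed)

# Expected counts under independence: row total * column total / n.
expected = [row_totals[i] * col_totals[j] / n for i in range(2) for j in range(2)]

g2 = likelihood_ratio_chi_square(observed, expected)
print(round(g2, 2))
```

A model that reproduces the observed counts exactly, such as the saturated model, gives a statistic of zero; larger values indicate a poorer fit.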

Goodman (1973, 1979) suggests that a model such as the one illustrated
in Figure 1 can be estimated by taking the marginal distribution of race
and ability as given, and testing for the links between grades and race,
and grades and ability. The second part of the model assumes the
relationship between race, ability and grades, and tests the links between
withdrawal and grades, and withdrawal and ability.
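The first step can be sketched with iterative proportional fitting, one standard way of obtaining expected frequencies for hierarchical loglinear models. The paper does not say which program was used, so this is an assumption, as are all names below. The counts are Table 1 collapsed over dropout, and they rely on the assumed readings of 13 and 35 for the two hard-to-read cells.

```python
import math
from itertools import product

# COUNTS[race][ability] lists grade groups A, B, C, D or F.
# Race: 0 = White, 1 = Other. Ability: 0 = Low, 1 = Medium, 2 = High.
COUNTS = [
    [[24, 36, 141, 13], [337, 460, 638, 52], [946, 592, 488, 33]],
    [[25, 42, 143, 17], [46, 83, 160, 16], [40, 21, 39, 0]],
]
OBSERVED = {
    (r, a, g): float(COUNTS[r][a][g])
    for r, a, g in product(range(2), range(3), range(4))
}
RA, RG, AG = (0, 1), (0, 2), (1, 2)  # index pairs naming the fitted marginals


def margin(table, axes):
    """Sum a table over all variables except those in axes."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[i] for i in axes)
        out[key] = out.get(key, 0.0) + value
    return out


def fit_ipf(observed, marginals, max_cycles=500, tol=1e-9):
    """Scale a table of ones to each fitted marginal in turn until stable."""
    fitted = {cell: 1.0 for cell in observed}
    for _ in range(max_cycles):
        change = 0.0
        for axes in marginals:
            target = margin(observed, axes)
            current = margin(fitted, axes)
            for cell in fitted:
                key = tuple(cell[i] for i in axes)
                if current[key] > 0.0:
                    new = fitted[cell] * target[key] / current[key]
                    change = max(change, abs(new - fitted[cell]))
                    fitted[cell] = new
        if change < tol:
            break
    return fitted


def g_squared(observed, fitted):
    """Likelihood ratio chi-square; empty observed cells contribute zero."""
    return 2.0 * sum(
        n * math.log(n / fitted[cell]) for cell, n in observed.items() if n > 0
    )


MODELS = {
    "[RA][RG][AG]": [RA, RG, AG],
    "[RA][AG]": [RA, AG],
    "[RA][RG]": [RA, RG],
}
RESULTS = {
    name: g_squared(OBSERVED, fit_ipf(OBSERVED, marginals))
    for name, marginals in MODELS.items()
}
for name, value in RESULTS.items():
    print(name, round(value, 2))
```

Because the counts are transcribed, the printed values may differ slightly from the published ones. The [RA][RG] model has a closed-form solution, and a hand calculation with these counts gives roughly 427.8 for it.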

Specifically, for the model with grades as the response variable, and
race and ability as explanatory, one compares the [RA][RG][AG] model to the
completely saturated model; if the [RA][RG][AG] model fits just as well as
the saturated model, one concludes the effects of race and ability on
grades are additive (i.e., no interaction among race, ability and grades).
The results of this test are shown in the upper panel of Table 2. The
letters define the variables as shown in Table 1. The [RA][RG][AG] model
has a likelihood ratio chi-square statistic of 13.63 with 6 degrees of
freedom. Compared to the saturated model (chi-square = 0.0, with df = 0), one
sees that the [RA][RG][AG] model fits the data reasonably well; the expected
frequencies under the [RA][RG][AG] model do not differ significantly at
the .01 level from the observed frequencies. One would conclude at this
level of probability that the effects of race and ability on grades are
additive. Comparisons of the models [RA][AG] and [RA][RG] to [RA][RG][AG]
(note that the [RA] marginal effect is never removed from the model
because it is a given association as far as grades is concerned) reveal
that the removal of the effect of either race or ability on grades
seriously erodes the model's ability to reproduce the observed frequencies.
That is, both race and ability independently contribute to the explanation
of the distribution of grades.
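The degrees of freedom attached to each model follow from its fitted marginals: the number of cells less the number of independent parameters implied by the marginals and all of their lower-order relatives. A short sketch of that count, using hypothetical function names and the category counts given for the four variables, is:

```python
from itertools import combinations

# Number of categories for race, ability, grades, and dropout.
LEVELS = {"R": 2, "A": 3, "G": 4, "D": 2}


def implied_terms(generators):
    """Every non-empty subset of variables contained in some fitted marginal."""
    terms = set()
    for generator in generators:
        for size in range(1, len(generator) + 1):
            terms.update(combinations(sorted(generator), size))
    return terms


def degrees_of_freedom(generators, variables):
    """Cells in the table minus independent parameters in the model."""
    cells = 1
    for v in variables:
        cells *= LEVELS[v]
    parameters = 1  # the grand mean
    for term in implied_terms(generators):
        count = 1
        for v in term:
            count *= LEVELS[v] - 1
        parameters += count
    return cells - parameters


print(degrees_of_freedom(["RA", "RG", "AG"], "RAG"))          # grades, additive
print(degrees_of_freedom(["RAG", "RD", "AD", "GD"], "RAGD"))  # withdrawal, additive
```

This simple count makes no adjustment for empty cells. Applied to [RA][AG] in the three-way table it returns 9.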

The model with college withdrawal as the response variable, and race,
ability and grades as explanatory, specifies the most appropriate model
to be [RAG][AD][GD]. That is, the association of race, ability and grades
is taken as given, and withdrawal depends upon the additive effects of
ability and grades. To determine if, in fact, the effects of race, ability
and grades on withdrawal are additive, one compares the [RAG][RD][AD][GD]
model to the interactive model [RAG][RAD][RGD][AGD]. These results are
shown in the lower panel of Table 2, in lines (4) and (5). The likelihood
ratio chi-square statistic may be partitioned, and the presence of
interactive associations may be determined by testing the difference between
the chi-squares. The partitioned chi-square is 23.92 with 11 degrees of
freedom; this value is not significant at the .01 level of probability, and
one concludes the effects of race, ability and grades on withdrawal are
additive.
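The partition can be checked by subtraction and a tail probability. The sketch below assumes the line (4) chi-square is 4.56, the value that makes the difference from line (5) equal the reported 23.92. The function `chi_square_upper_tail` is a hypothetical helper that uses the standard closed-form series for integer degrees of freedom.

```python
import math


def chi_square_upper_tail(x, df):
    """P(chi-square with integer df exceeds x), via finite series."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^r / r! for r below df/2.
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite correction series.
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(chi / math.sqrt(2)) + 2 * density * total


# Lines (5) and (4) of Table 2: (chi-square, degrees of freedom).
additive = (28.48, 17)
interactive = (4.56, 6)  # 4.56 is an assumed reading

difference = round(additive[0] - interactive[0], 2)
df_difference = additive[1] - interactive[1]
p_interaction = chi_square_upper_tail(difference, df_difference)
p_grades_model = chi_square_upper_tail(13.63, 6)

print(difference, df_difference, p_interaction > 0.01)
print(p_grades_model > 0.01)
```

Both probabilities lie between .01 and .05, which is why the conclusions about additivity are stated at the .01 level.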

To determine if the effect of race on withdrawal is negligible once
ability and grades are accounted for, one compares the [RAG][AD][GD] model
to the [RAG][RD][AD][GD] model. These results are shown in the lower panel
of Table 2 in lines (5) and (6), and reveal that the [RAG][AD][GD] model fits
the data very well indeed compared to the model with an [RD] marginal effect.
Thus, the hypothesis of no race effect on withdrawal is confirmed. Finally,
lines (7) and (8), compared with line (6), show that both ability and grades
have important additive effects on withdrawal; dropping either marginal
effect seriously erodes the model's ability to reproduce the observed
frequencies.
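These three comparisons amount to differencing nested models and setting each difference against a chi-square critical value. A minimal sketch follows, with hypothetical names. It assumes the line (7) chi-square is 240.85, since the leading digit is hard to read in the available copy, and uses the standard upper 1 percent points for 1, 2, and 3 degrees of freedom.

```python
# Lower panel of Table 2: line -> (fitted marginals, df, chi-square).
# The chi-square for line (7) is an assumed reading.
TABLE2 = {
    5: ("[RAG][RD][AD][GD]", 17, 28.48),
    6: ("[RAG][AD][GD]", 18, 29.48),
    7: ("[RAG][AD]", 21, 240.85),
    8: ("[RAG][GD]", 20, 157.86),
}

# Upper 1 percent points of the chi-square distribution.
CRITICAL_01 = {1: 6.635, 2: 9.210, 3: 11.345}


def conditional_test(restricted, fuller):
    """Difference in chi-square and df between two nested models."""
    _, df_r, g2_r = TABLE2[restricted]
    _, df_f, g2_f = TABLE2[fuller]
    difference = round(g2_r - g2_f, 2)
    df = df_r - df_f
    return difference, df, difference > CRITICAL_01[df]


TESTS = {
    "race [RD]": conditional_test(6, 5),
    "grades [GD]": conditional_test(7, 6),
    "ability [AD]": conditional_test(8, 6),
}
for effect, (difference, df, significant) in TESTS.items():
    print(effect, difference, df, "significant" if significant else "negligible")
```

Only the race effect falls below its critical value, which matches the conclusion drawn in the text.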

In conclusion, withdrawal from colleges and universities depends upon
the additive effects of ability and grades, but not race. If we take grades
to be a manifest measure of integration into a meritocratic reward structure,
we see that those people insufficiently integrated into the system are those
most likely to withdraw from the system. Moreover, those people with less
ability are also those more likely to withdraw. These effects of ability
and grades are independent of race.

Causal analyses with quantitative variables have become a useful means
of understanding educational phenomena. Further extensions of causal
modeling techniques to include categorical variables will expand the range
of theoretical questions educational researchers are able to address. For
those interested in learning more about the loglinear analyses of contingency
tables, the texts of Fienberg (1977) and Bishop, Fienberg and Holland (1975)
are recommended. The papers by Goodman (1973, 1979) explicate models which
assume underlying causal priorities among the variables. These references
will serve to explain how to analyze multivariate contingency tables.
Hopefully, this paper has exemplified to the educational community the
utility of these causal methods. Theoretically interesting questions which
depended upon the analysis of categorical data were heretofore unanswerable.
These new methods open new avenues for the development and testing of
theory in education.




TABLE 1

Observed Cross-Classification of 4392 High School Graduates of
1972 Who Attended Two or Four Year Colleges, by College Grades,
Ability, and Race

(R)        (A)        (G)           (D) Dropout
Race       Ability    Grades        Yes       No

White      Low        A              15        9
                      B              22       14
                      C              85       56
                      D or F         12        1

           Medium     A             103      234
                      B             127      333
                      C             270      368
                      D or F         43        9

           High       A             140      806
                      B             106      486
                      C             157      331
                      D or F         27        6

Other      Low        A              12       13
                      B              15       27
                      C              75       68
                      D or F         13        4

           Medium     A              17       29
                      B              27       56
                      C              65       95
                      D or F         14        2

           High       A               5       35
                      B               3       18
                      C              14       25
                      D or F          0        0




TABLE 2

Likelihood Ratio Chi-Square Values for Some Loglinear Models Applied
to the Data in Table 1

                                                     Likelihood
                                     Degrees of      Ratio
Model   Fitted Marginals             Freedom         Chi-Square

GRADES
 (1)    [RA][RG][AG]                     6              13.63
 (2)    [RA][AG]                         9              30.01
 (3)    [RA][RG]                        12             427.82

WITHDRAWAL
 (4)    [RAG][RAD][RGD][AGD]             6               4.56
 (5)    [RAG][RD][AD][GD]               17              28.48
 (6)    [RAG][AD][GD]                   18              29.48
 (7)    [RAG][AD]                       21             240.85
 (8)    [RAG][GD]                       20             157.86




REFERENCES 



Bishop, Yvonne M. M., Stephen E. Fienberg, and Paul W. Holland
    1975  Discrete Multivariate Analysis: Theory and Practice.
          Cambridge: The MIT Press.

Duncan, Otis Dudley
    1966  "Path analysis: sociological examples." American Journal of
          Sociology 72: 1-16.
    1975  Introduction to Structural Equation Models. New York:
          Academic Press.

Durkheim, Emile
    1951  Suicide: A Study in Sociology. Translated by John A.
          Spaulding and George Simpson. New York: The Free Press.

Fienberg, Stephen E.
    1977  The Analysis of Cross-Classified Categorical Data.
          Cambridge: The MIT Press.

Goodman, Leo A.
    1972  "A general model for the analysis of surveys." American
          Journal of Sociology 77: 1035-1086.
    1973  "Causal analysis of data from panel studies and other kinds
          of surveys." American Journal of Sociology 78: 1135-1191.
    1979  "A brief guide to the causal analysis of data from surveys."
          American Journal of Sociology 84: 1078-1095.

Jöreskog, Karl G., and Dag Sörbom
    1978  LISREL: Analysis of Linear Structural Relationships by the
          Method of Maximum Likelihood, User's Guide. Chicago:
          National Educational Resources, Inc.

Levinsohn, Jay R., Louise B. Henderson, John A. Riccobono, and R. Paul Moore
    1978  National Longitudinal Study: Base Year, First, Second and
          Third Follow-up Data File Users Manual. Washington, D.C.:
          National Center for Education Statistics.

Tinto, Vincent
    1975  "Dropout from higher education: a theoretical synthesis of
          recent research." Review of Educational Research 45: 89-125.

Wolfle, Lee M.
    forthcoming  "Strategies of path analysis." American Educational
          Research Journal.

Wright, Sewall
    1921  "Correlation and causation." Journal of Agricultural
          Research 20: 557-585.



